Abstract

Several attempts have been made to illustrate the organization of the monolingual mental lexicon and each 
model proposed so far has highlighted different aspects of lexical processing. What they have in common is the 
fact that their depictions rely on single lexical items and paradigmatic relations come to the fore in their 
explanations. Hoey’s lexical priming theory (2005) tries to shed light on the issue of collocational processing in 
the internal lexicon from a cognitive and psycholinguistic perspective and its importance for our overall creative 
language production. A number of psycholinguistic studies have tested Hoey's theory as it relates to English, but 
work in other languages is limited. The present study broadens the scope of work in this area by investigating 
whether collocational priming also holds for speakers of Turkish. Furthermore, the possible influence of 
frequency and part of speech on collocational priming is scrutinized by exploring the correlations between 
response times in the priming experiment and these independent variables. The findings revealed a significant 
collocational priming effect for Turkish L1 users, in line with Hoey’s claims. The regression analysis indicated
frequency and part of speech as important predictors of processing duration. The correlation analysis also 
showed significant correlations between the response times and both word and collocational frequency. A 
tentative mental lexicon framework is proposed based on the findings of this research. 

©2017 JLLS and the Authors - Published by JLLS. 
1. Introduction and Literature Review

As Sinclair (1991) and Hoey (2005) state, to shed light on the principles behind the processing
and acquisition of collocations, we need to look at them within the broader perspective of formulaic
language as a whole.

‘Formulaic language’ has been defined as ‘recurrent multi-word lexical items having a single 
meaning or function’ and it is generally employed as an umbrella term for idioms, collocations, lexical 
bundles etc. (Schmitt, 2010). Writers have addressed the issue of formulaic language in many different 
ways and used different terms, often in inconsistent ways (Wray, 2002). Many researchers (e.g. Wray, 
2002; Schmitt, 2010) acknowledge that formulaic language is one of the key components of language 


* Corresponding author. Tel.: +90-312 485 05 07
E-mail address: hakancangir@gmail.com











Hakan Cangir et al. / Journal of Language and Linguistic Studies, 13(2) (2017) 465-486 


mainly because of its pervasiveness in language use. Furthermore, meanings and functions are
achieved by dint of formulaic language, and language users who produce formulaic phrases
enjoy a processing advantage (Conklin and Schmitt, 2012).

One reason why researchers concentrate on formulaic language is the view that
formulas are basic language units (e.g. Conklin and Schmitt, 2012). This theoretical stance is informed
by Sinclair’s (1991) idiom principle, pattern grammar (Hunston and Francis, 2000), and
construction grammar (Goldberg, 2006). Sinclair claims that a language user knows a huge number of
semi-preconstructed phrases, many of which are uttered in speech and can be observed in texts. It is
even estimated that about half of fluent native text is constructed according to the idiom principle.

Another rationale comes from the theoretical position that formulas seem to have a unique
psycholinguistic status and that they have a vital role in language acquisition (Schmitt, 2010). The
investigation of formulaic language is important because there may be a link between
learners’ use of formulaic language and their perceived proficiency in the language (e.g. Staples,
Egbert, Biber and McClair, 2013), though empirical research has not yet produced conclusive
results. However, many researchers have concluded that formulaic sequences,
statistically defined and extracted from large and balanced corpora, have implications for educational
and psycholinguistic research and applications (Ellis and Simpson-Vlach, 2009).

Given that formulaic language plays an important role in language processing and language 
acquisition and that collocations are regarded as a sub-category of this group, the current research, 
which investigates collocational priming in Turkish, approaches the issue of lexical processing from a 
syntagmatic perspective and attempts to come up with a tentative framework for the structuring of 
collocations in the internal lexicon. 

As stated by Cruse (2000), the vocabulary of a language is organized around two main types of
relation: paradigmatic and syntagmatic links. Based on this organization, collocations can be placed under
the syntagmatic branch together with other multi-word units, whereas synonyms, antonyms and
hyponyms are classified at the paradigmatic end.

In addition to where collocations stand in the vocabulary knowledge organization, the definition of 
the term is also an important issue to consider and has been a controversial phenomenon in 
psycholinguistic, corpus linguistic and language acquisition research. 

Firth (1957), who is considered one of the first linguists to use the term collocation in its modern
linguistic sense, says:

Meaning by collocation is an abstraction at the syntagmatic level and is not directly 
concerned with the conceptual or idea approach to the meaning of words. One of the 
meanings of night is its collocability with dark, and, of dark, of course, collocation with 
night. (Firth, 1957: 196) 

As discussed above, collocations are commonly seen as a subcategory of
formulaic language (Wray, 2002). Notwithstanding their apparently prevalent use in language,
collocations are difficult to define (Wolter & Yamashita, 2014). Two commonly accepted approaches
to the definition can be observed in the literature. The first one, the phraseological approach (Cowie,
1994; Howarth, 1998), asserts that a word cluster can be considered a genuine collocation on condition
that one of the words in the cluster is non-compositional (i.e. non-transparent or opaque), which makes
the combination semi-transparent. If both members of the combination are fully compositional, the
item is called a 'free combination' (as in “brush teeth”) as far as the phraseological approach is
concerned. If both members are non-transparent or opaque, the cluster is called an 'idiom' (as in
“kick the bucket”). Benson et al. (1986) stated that word combinations are grouped according to three
principal criteria: the level of cohesiveness, semantic transparency and frequency. The basic problem
with the classification provided by the phraseological approach is that it is challenging to
decide on the boundaries between these categories.

The second acknowledged approach has close links with corpus linguistics and employs statistical
measures to investigate the frequency of the co-occurrence of certain word patterns (Sinclair, 1991).
The rationale behind the frequency approach originates from the idea that the more frequently words
co-occur in written or spoken language, the more likely the combination is to be entrenched in
the mental lexicon and to be seen as a collocation. Native speakers of the language and even some
advanced second language users produce these word combinations automatically and enjoy a
processing advantage, which eventually affects their fluency. According to Henriksen (2013),
integrating the corpus approach into research is logical because the researcher then relies on objective
criteria, such as frequency, range and span, rather than on intuition about word pairs. As for the
problems with this approach, Howarth (1998) states that it focuses on performance and takes no
notice of competence. Extracting word pairs from corpora based on frequency measures without
paying attention to semantics could yield word pairs that native speakers would not consider
collocations, as in the case of the English definite article ‘the’. It appears to collocate with all nouns due
to its pervasive use in language, and if researchers rely on corpus data only, the frequency measures are
likely to misguide their analysis and interpretation when the primary aim is to explore
collocational processing in the mental lexicon. In other words, without considering the semantic
aspect, corpus-extracted word pairs tend to lack strong psycholinguistic legitimacy for language
users.

Given that each approach has its strengths and weaknesses, the current research applied both
strategies as complementary methods, in line with some earlier research (see Nesselhauf, 2005 for a
discussion). Therefore, in the current research, for a word combination to be
considered a collocation, it must be frequent at a certain level (benchmarks are given in the
methodology section) and semi-transparent, an approach that was employed by some earlier research
(e.g. Kjellmer, 1984; Kjellmer, 1987). As this study was conducted to set a baseline for a
cross-linguistic investigation, the lexical items were adopted from the main experiment. Recurrent word
combinations in two balanced corpora (Corpus of Contemporary American English and Turkish
National Corpus) were detected with the help of association measures, which will be discussed in
more detail in the methodology section. After that, the list of collocations was fine-tuned based on
their semantic features (i.e. compositionality). We believe that this mixed approach to
selecting the word combinations used in the experiment was a sound move considering the pros
and cons of each approach and their complementary nature.

The discussion so far has tried to shed light on the basic concepts, formulaic language and
collocations, to provide some basic insight into syntagmatic relations between words. The core
paradigm employed in the study also needs explaining before giving details about the methodology.

Firth’s famous saying “you shall know a word by the company it keeps” has been used and adopted 
by many linguists and the philosophy behind this notion has been discussed and enhanced in many 
aspects over the years. Rooted in the Firthian tradition, a new theory of lexical priming was
proposed by Hoey (2005). The theory asserts that every word is mentally primed for collocational use 
and collocational priming is sensitive to the contexts where the lexical unit is encountered. The fact 
that a lexical item is employed in specific combinations in particular types of texts constitutes part of 
our knowledge of that lexical unit. According to his definition of the term, collocation: 
“Collocation is a psychological association between words which is evidenced by their
occurrence together in a corpora more frequently than is rational in terms of random
distribution” (2005, pp. 3-5)

Hoey (2005) further claims that priming can also be seen as the source of our creative language 
system. According to him, the grammatical categories assigned to lexical units are determined by 
lexically specific patterns of priming rather than an independently existing grammar. This view is in 
accord with the usage-based models, which are closely linked with Cognitive Linguistics and 
Construction Grammar (Barlow & Kemmer, 2000). The cognitive view of language postulates that 
language learning emerges from general practices of human inductive reasoning being applied to the 
specific problem of language (Tomasello, 2003). Unlike the Chomskyan view of language,
cognitivists assert that a language acquisition device per se does not exist. Rather, language goes hand in
hand with other cognitive processes, though its cognitive content could vary. In addition, the cognitive
view of language posits that genes do not appear to be the sole source of language. On the contrary,
language emerges from the structure of adult language and the structure of social and cognitive
skills (Ellis, 2001).

Considering his views and stance, one can deduce that Hoey is at odds with Generative Grammar 
(Chomsky, 1965) and approaches the issue from a psycholinguistic perspective. According to the 
Chomskyan view of language, the principal goal of linguistics is to investigate speakers’ competence, 
which is also defined as the abstract system of linguistic knowledge, rather than linguistic 
performance. Chomsky is interested in the internalized (i-) language, not the externalized (e-)
language. By contrast, what Hoey and Sinclair concentrate on is the exploration of e-language
enhanced by corpora. Sinclair states that scrutinizing competence and disregarding real-life language
in an attempt to escape the noise or the disorganization in language use does not make sense as the 
larger-scale corpora these days are powerful enough to help researchers to get a clear picture of real 
language use and find significant patterns of various language phenomena (1991, p. 103). 

On the whole, Hoey thinks that all forms of priming (lexical, textual, grammatical, etc.) accumulate as
one is exposed to the real language around him. Because we have different language learning
experiences, the priming effect can differ slightly for each person. However, those minor variations
appear to be adjusted over time as we have more exposure, since there need to be some standards so that
language users can comprehend each other through a common use of lexical units (2005, p. 9). The
sources of these standards, he says, include education, traditions, the mass media and reference works
like dictionaries (2005, pp. 181-182).

Hoey accepts that priming might harbour some conflicts. A basic example can be observed in the
rules taught at school or in grammar books, which seem to contradict native speaker
intuition. To give an example from Turkish, we can think of the “neither ... nor ...” (ne ... ne de ...)
construction. Considering the negative form of the phrase, native speakers of Turkish are primed by
their intuition to use a negative verb at the end of this phrase (Ne annesi ne de babası ona yardım
etmedi - “Neither his mother nor his father did not help him”); however, Turkish prescriptive
grammars state the opposite (Ne annesi ne de babası ona yardım etti - “Neither his mother nor his
father helped him”) as the correct grammatical form of the sentence.

Some studies have exploited the collocational priming paradigm to
find evidence for the psycholinguistic notion of priming. Those studies mainly used experimental
psycholinguistic techniques and tools, such as lexical decision, word naming and semantic association.

In one of those studies, by Durrant and Doherty (2010), evidence for collocational priming was
found and the writers claimed that their findings were partly in line with Hoey’s (2005) lexical
priming theory. It was the first research proposing a frequency-based collocational priming
explanation independent of psychological association. However, in their second experiment, they
found results inconsistent with the first experiment. The results indicated that there was a priming
effect for the associated word pairs but not for high-frequency collocations. Therefore, they were
cautious in their interpretation and called for further research. In an earlier study by McKoon and
Ratcliff (1992), a weak priming effect was detected for high-frequency collocations. The researchers
reported a small corpus and the lack of a psychological association measure as their limitations.
Thus, they tentatively suggested a possible priming influence and avoided making strong claims.

There have been many other attempts to shed light on the processing of collocations in L1 and L2.
Some researchers, Wray (2002, 2008) in particular, claimed that native speakers (NS) process
collocations or formulaic phrases as chunks, whereas non-native speakers (NNS) decompose the
whole into its single units to process it. However, some others (e.g. Durrant and Schmitt, 2010)
disagreed with Wray’s stance, claiming that NS and NNS do not differ in their approach to the
acquisition of collocations. Rather, NNS process collocations differently because they have insufficient
language input and limited exposure.

The studies discussed above attempted to test the hypothesis that words are primed to co-occur or
to question whether they are stored as chunks in the mental lexicon, an idea different versions of which have
been proposed and discussed for a long time (e.g. Sinclair, 1987; Ellis 2001; Hoey, 2005). However,
to the researchers’ knowledge, no research to date has considered a typologically different
language in its investigation and approached the issue of collocational priming from this angle. With
this in mind, the current research seeks to answer the research questions below:


a- Does collocational priming exist in Turkish? 

b- To what extent does frequency play a role in collocational priming, if any? 


To this end, a monolingual priming experiment including a lexical decision task was designed 
following the standards of the paradigm. The details of the approach are provided in the following 
section. 


2. Method 

2.1. Overall Design 

The experiment was a lexical decision task including a balanced number of collocations,
non-collocations, and some filler items to balance the proportion of the target items with the control and
non-word items (with a relatedness proportion of 0.24 and a non-word ratio of 0.27). To be more
precise, for each collocational item (e.g. soğuk savaş - “cold war”), there was one non-collocation
with the same target word but a different prime word of the same word length (+/-1) and a similar
prime word frequency (e.g. uzak savaş - “far war”), a filler non-collocation consisting of random
words with the same target word length (+/-1) (e.g. geniş nefret - “broad hatred”), and a non-word pair
consisting of a random prime word followed by a non-word made up by the Turkish L1 members of
the research team (e.g. çukur sagit - “hollow sagit”). Additionally, to address the relatedness proportion and
non-word ratio concerns, the team came up with ten more non-collocation items and non-word items
including made-up words (i.e. fillers) of similar word length to the other items. Eventually, only
the mean response times for the collocate (e.g. soğuk savaş - “cold war”) and corresponding
non-collocate items (e.g. uzak savaş - “far war”) were investigated in the regression and correlation
analyses; the response times of all the other lexical items were intentionally ignored due to the
design of the current research.

The relatedness proportion is the ratio of related prime-target lexical items to
all the lexical items. It is claimed that the bigger the relatedness proportion is, the stronger the
semantic priming is (de Groot, 1984). For this reason, a standard level (lower than 0.25) mentioned in
Jiang (2012) was adopted. The non-word ratio is the proportion of non-words to all the collocational,
non-collocational items and unrelated word pairs (see Altarriba and Basnight-Brown, 2007 for a
discussion).
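The two ratios defined above reduce to simple proportions. The sketch below is only an illustration under stated assumptions: the item counts are hypothetical round numbers chosen to make the arithmetic easy to follow, not the study's actual item counts, and the function names are ours.

```python
def relatedness_proportion(related_pairs: int, total_pairs: int) -> float:
    """Share of related (collocational) prime-target pairs among all pairs."""
    return related_pairs / total_pairs


def nonword_ratio(nonword_pairs: int, total_pairs: int) -> float:
    """Share of pairs whose target is a non-word among all pairs."""
    return nonword_pairs / total_pairs


# Hypothetical item counts (assumption, for illustration only).
counts = {"collocation": 24, "non_collocation": 24, "filler": 25, "non_word": 27}
total = sum(counts.values())

rp = relatedness_proportion(counts["collocation"], total)
nwr = nonword_ratio(counts["non_word"], total)

print(f"relatedness proportion = {rp:.2f}, non-word ratio = {nwr:.2f}")
# A design would be adjusted (e.g. by adding fillers) until rp stays below 0.25.
print("below the 0.25 level:", rp < 0.25)
```

Adding filler pairs raises the total without raising the number of related pairs, which is how a design can be brought under the adopted level.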

The stimulus onset asynchrony (SOA), which is described as the time interval between the onset of
the prime word and the onset of the target word, was set to 100 milliseconds to comply with the standards of the
priming paradigm based on the discussion by Jiang (2012). The remote version of DMDX¹ was used
in the current research since one of the researchers was abroad during the actual application. The
research team compiled the priming experiment script together with a simple batch file so that the test
could run automatically on each participant’s screen and send the results of the experiment to the team
by e-mail. The lexical items were presented in random order. The subjects were guided through a
web interface designed for this research only, which included all the details about the procedure and
the necessary steps. Example items from the priming experiment are shown in Table 1:


Table 1. A sample DMDX screen

SCREEN 1    SCREEN 2     SCREEN 3      SCREEN 4                  Item type
(500 ms)    (200 ms)     prime word    target word
                         (100 ms)      (response is recorded)
*           #########    yapmak        HATA                      Collocation
*           #########    almak         HATA                      Non-collocation
*           #########    dürtmek       PAZI                      Filler
*           #########    çarpmak       LATİ                      Non-word


After the priming experiment, the subjects took an online end-of-test questionnaire with
questions about vision, dexterity and priming items. They were asked if they were able to consciously
see the priming items flashed before the target words for 100 milliseconds and whether they detected a
pattern between the stimulus and the target. The aim was to make sure that the collocational processing was automatic
and that they were not making use of any conscious strategies during lexical processing.

The output of the lexical decision task and the frequency values were analysed using the Statistical
Package for the Social Sciences (SPSS) 23 software. The analyses covered the difference between the
mean response times of collocate and non-collocate items only and the relationship between the mean
response times and the frequency measures.


¹ DMDX is software developed at Monash University and at the University of Arizona by K. I. Forster and J.
C. Forster (2003) and provided as an open-source tool.
2.2. Participants

Forty-one native speakers of Turkish (27 female and 14 male) took part in the study. Participants were
either undergraduate students at Ankara University (N=28) or lecturers from different universities in
Ankara (N=13). They were aged between 18 and 55.

Several instruments were used during the process. A digit span test was employed to
evaluate the prospective participants’ short-term memory and make sure that they could keep a lexical item
they saw on a computer screen in mind for the required period of time. The test is used as a standard
procedure in psycholinguistic experiments and it was conducted through a simple Java application in
which participants were asked to recollect the numbers presented to them and type them on the screen
accurately. At the end, the application provided a digit span score giving an overview of the
participants’ short-term verbal memory. All the subjects scored 6 or above in the test and went on to take
the monolingual collocational priming experiment, which was conducted with the help of the DMDX
software.

2.3. Item development 

An important issue for the current study was to extract the collocational items from corpora based
on both statistical and semantic aspects, the rationale of which has been discussed in the previous
section. Because this monolingual experiment was the first step of a cross-linguistic priming study, the
researchers made use of the Corpus of Contemporary American English (COCA) (Davies, 2008-) and
the Turkish National Corpus (TNC) (Aksan et al., 2012) in combination. First, around 70 V+N and 70
ADJ+N collocations were chosen from the COCA list (Davies, 2008-) of English collocations, which
provided only the MI values of the word combinations on the list as a frequency measure. The
t-scores of all those collocational items were also computed separately with the help of a spreadsheet
developed by Philip Durrant. The chosen collocations were required to have an MI score of at least 3.0
and a t-score of at least 2.0, which have been mentioned as benchmarks in some research (Schmitt, 2010), and to be
semantically semi-transparent. The research team chose the semi-transparent collocations based on
their native speaker intuitions and then two independent judges were asked to confirm the semantic
opaqueness of the items. Once the semi-transparent items were chosen, they were
cross-checked against their Turkish counterparts in the TNC to make sure they had an MI score of at least 3.0.
Together with the MI score, which has its weaknesses like any other association measure, the t-score was
also integrated into the study as a complementary frequency measure. The items were fine-tuned so
that they had an MI score of at least 3.0 and a t-score of at least 2.0, both in Turkish and English, to
comply with the standard benchmark values in Schmitt (2010). Additionally, the research team made
sure that the chosen items in Turkish and English had no case marking, since many
prominent linguists (e.g. Hoey, 2005; Sinclair, 1991) hold that lemmatization tends to fail to reflect
essential differences in collocational preferences between different forms of a lemma. The decision
can also be attributed to Durrant’s (2014) findings indicating that the difference between lemmatized
and non-lemmatized frequency values in terms of their correlation with learner knowledge of
collocations is unclear. Additionally, the trial version of the TNC did not allow for a part-of-speech search,
which made lemmatization hard to achieve.

With regard to the type of collocations chosen for the current study, Verb+Noun (V+N) and
Adjective+Noun (ADJ+N), which were investigated comprehensively by previous research as well
(e.g. Siyanova & Schmitt, 2008; Fan, 2009; Barfield & Gyllstad, 2009; Wolter, 2006; Wolter & Gyllstad,
2011), were selected for a specific purpose. This study is part of a larger
cross-linguistic (Turkish-English) study, for which it will be important that adjective-noun word order is similar
in the two languages whereas verb-noun word order is not. Because the current
research served as a starting point for the follow-up cross-linguistic experiment, it adopted the
same lexical items extracted and categorized for the cross-linguistic investigation so that the findings
would be reliable and comparable. Furthermore, although the word order for V+N collocations in Turkish
is the opposite of that in English (N+V), the researchers preferred the English word order in the
priming experiment to obtain comparable data for the planned future study. For instance, when the
collocation was ışık tutmak - “shed light”, the prime word was tutmak - “shed” and the following
target word was ışık - “light” in the priming script.

Taking into account all these factors, the research team came up with thirty ADJ+N and thirty V+N
items with no case marking that were chosen strategically for cross-linguistic investigation purposes to
be used in another experiment, and the same items were employed in the current experiment in order to
have comparable data in the end.

Although it was not part of the item development procedure, another association measure, Delta P
(ΔP) by Gries (2013), was integrated into the final analysis to test the possible bidirectional activation
of collocational networks. (See the complete list of items in Appendix A and how MI, t and ΔP values
were computed in Appendix B.)
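For readers unfamiliar with the three association measures, the sketch below uses their common textbook definitions: MI as the base-2 logarithm of observed over expected co-occurrence, the t-score as observed minus expected over the square root of observed, and ΔP as a difference of conditional probabilities from a 2x2 contingency table. This is an assumption about the general form of the measures, not a reproduction of the procedure in Appendix B, and the counts are hypothetical.

```python
import math


def expected_cooccurrence(f1: int, f2: int, n: int) -> float:
    """Co-occurrences expected by chance for words with frequencies f1 and f2."""
    return f1 * f2 / n


def mi_score(o: int, f1: int, f2: int, n: int) -> float:
    """Mutual information: log2 of observed over expected co-occurrence."""
    return math.log2(o / expected_cooccurrence(f1, f2, n))


def t_score(o: int, f1: int, f2: int, n: int) -> float:
    """t-score: (observed - expected) divided by sqrt(observed)."""
    return (o - expected_cooccurrence(f1, f2, n)) / math.sqrt(o)


def delta_p(o: int, f1: int, f2: int, n: int) -> tuple[float, float]:
    """Return (ΔP 2|1, ΔP 1|2) from the 2x2 contingency table."""
    a = o                  # word 1 and word 2 together
    b = f1 - o             # word 1 without word 2
    c = f2 - o             # word 2 without word 1
    d = n - f1 - f2 + o    # neither word
    dp_2_given_1 = a / (a + b) - c / (c + d)
    dp_1_given_2 = a / (a + c) - b / (b + d)
    return dp_2_given_1, dp_1_given_2


def meets_benchmarks(mi: float, t: float, mi_min: float = 3.0, t_min: float = 2.0) -> bool:
    """Check the MI >= 3.0 and t >= 2.0 benchmarks adopted in the study."""
    return mi >= mi_min and t >= t_min


# Hypothetical corpus counts (assumption, for illustration only).
o, f1, f2, n = 100, 1_000, 2_000, 1_000_000
mi = mi_score(o, f1, f2, n)
t = t_score(o, f1, f2, n)
dp21, dp12 = delta_p(o, f1, f2, n)
print(f"MI={mi:.2f}, t={t:.2f}, dP(2|1)={dp21:.3f}, dP(1|2)={dp12:.3f}")
print("meets benchmarks:", meets_benchmarks(mi, t))
```

Unlike MI and the t-score, ΔP is directional: the two values differ whenever the two words have different overall frequencies, which is what makes it suitable for probing bidirectional activation.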


3. Results 

Every subject confirmed that they had normal or corrected-to-normal vision. All the participants
were right-handed except for a single subject who was dominant in both hands. Below is an
overview of the participants’ biographical information, dexterity and vision.

Table 2. Summary of participants’ biographical information

GROUP                  Age (a)       Dexterity (R/L/B)    Gender (M/F)      Vision
Turkish ONLY (N=41)    Mean: 24.4    R: 40 (97.6%)        F: 27 (65.9%)     No serious issues
                                     B: 1 (2.4%)          M: 14 (34.1%)

(a) Range = 18-55

Subjects took a digit span test before the experiment and everybody scored 6 or more (Mean=7.5,
Range=6-9), which was regarded as sufficient for a normal short-term verbal memory. Participants at
Ankara University and other universities (Hacettepe, METU etc.) took the remote version of the
priming test. They were asked to take the test in a silent environment where they could focus on the task
only and nobody would interrupt them. Although 41 subjects took the priming test, only the results
of twenty-eight participants were deemed consistent and worth investigating further, on the
grounds that the other participants had an error rate of more than 20%, which is considered a threshold in
research adopting the priming paradigm (Jiang, 2000). Moreover, response times faster than 200
milliseconds or slower than 2000 milliseconds, and items deviating by more than 2.0 standard deviations, were
removed from the overall data in an attempt to adhere to the priming paradigm standards. It is
commonly thought that a language user cannot decide whether a lexical item is a word or not in less than 200
milliseconds in a lexical decision task; a faster response means that he/she is not paying attention to the
task, which makes the results unreliable. If participants spend more than 2000 milliseconds
processing a lexical item, that could indicate a strategic attempt, which needs to be avoided in
experiments aiming for an automatic priming effect. The results of the monolingual Turkish priming
experiment are displayed in Figure 1 as an overview and in more detail in Table 3.


[Figure 1: bar chart of mean response times (ms) for collocation and non-collocation items. As a whole: 567.4 vs. 585.5; ADJ+N: 575.5 vs. 588.5; V+N: 559.3 vs. 582.5.]

Figure 1. Mean response times in milliseconds, general view


Table 3. Mean response times in milliseconds, standard deviations in parentheses and error rates in square
brackets

Group         Number of lexical items    Collocation RT            Non-collocate RT          Priming effect
As a whole    60 items (120 total)       567.4 (40.14) [1.52%]     585.5 (38.89) [1.46%]     18.1, *p=.001, r=.41
V+N           30 items (60 total)        559.3 (33.54) [1.84%]     582.5 (34.92) [1.23%]     23.2, p=.009, r=.46
ADJ+N         30 items (60 total)        575.5 (44.91) [1.24%]     588.5 (42.89) [1.7%]      13.0, *p=.05, r=.36

*The significance level is .05

Based on the difference between the mean response times of the collocate and non-collocate items,
it can be deduced that the stimulus primes the target if the word combination is a collocation, which
indicates that collocational priming appears to exist in the Turkish language. The priming
effect for each condition is statistically significant at the level of p <.05. Although the priming effect
in the ADJ+N group is also significant, it appears that participants responded to the lexical items faster
if they were part of a V+N collocation, which resulted in a considerably stronger priming effect in this
group. The possible reasons behind this fast processing will be discussed in the next section.
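The priming effects in Table 3 are the differences between the mean response times of the non-collocate and collocate items. The short check below recomputes them from the means reported in the table; the variable and function names are ours.

```python
# Mean response times in ms as reported in Table 3: (collocation, non-collocation).
mean_rts = {
    "As a whole": (567.4, 585.5),
    "V+N": (559.3, 582.5),
    "ADJ+N": (575.5, 588.5),
}


def priming_effect(collocation_rt: float, non_collocation_rt: float) -> float:
    """Priming effect in ms: how much faster targets are after a collocating prime."""
    return round(non_collocation_rt - collocation_rt, 1)


effects = {group: priming_effect(*rts) for group, rts in mean_rts.items()}
for group, effect in effects.items():
    print(f"{group}: {effect} ms")
```

A positive value means that targets were recognized faster after a collocating prime than after a non-collocating one.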

As for the effect sizes, when all the items were merged, the effect size of the
priming effect was strong (r=.41). However, when each part-of-speech group was
analysed on its own, V+N collocations showed a strong effect size (r=.46), whereas the ADJ+N
collocations showed a medium effect size (r=.36). On the whole, the mean response times
of the non-collocate items differ little between the two groups; for the collocate items,
however, the mean response times are remarkably lower than those of the non-collocate
items, and the V+N items were
processed faster than the ADJ+N items by the Turkish participants.

Another point to note is that error rates in each category were low because outliers were trimmed 
during data categorization and analysis. The results can therefore be regarded as relatively reliable: 
participants paid sufficient attention to the experiment, and response times falling outside the 
standards of the priming paradigm were eliminated. 

To answer the second research question, correlation and regression analyses were conducted to 
reveal possible relationships between the dependent variable (mean response time) and the 
independent variables (the association measures and part of speech). The regression analysis 
additionally identified the possible significant predictors of mean response time in the priming 
experiment. 

The table below elucidates the significant correlations between the mean response times in the 
collocational priming experiment and the frequency values employed in the study. 

Table 4. Correlation Analysis Results

Variable                  Mean Response Times
Collocation status        -.224*
Target word frequency     -.346**
t-score                   -.334**
ΔP 1|2                    -.248**
ΔP 2|1                    -.199*
MI score                  -.166*

** Correlation is significant at the .01 level 
* Correlation is significant at the .05 level 

Based on the results of the correlation analysis, the mean response times for the lexical items 
show significant inverse correlations with collocation status (r=-.224, p<.05), target word 
frequency (r=-.346, p<.01), t-score (r=-.334, p<.01), ΔP 1|2 (r=-.248, p<.01), ΔP 2|1 (r=-.199, 
p<.05), and MI (r=-.166, p<.05) scores in Turkish. In other words, the negative correlations show 
that as the frequency values increase, the mean response times decrease. That is to say, frequency 
can be regarded as a factor that facilitates collocational processing. The frequency values presented 
in the table indicated a moderate correlation strength, with the exception of the ΔP 2|1 and MI 
values, which revealed weak correlations. 

The correlation between the mean response times and the ΔP values in both directions is also 
worth underlining. It could mean that the effect of the prime word on the target is as important as 
the influence of the target word on the prime word; that is to say, the interaction of the lexical 
items in the mental lexicon may be bidirectional. 
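
For readers who want to see what an inverse correlation of this kind looks like computationally, here is a minimal Pearson r sketch. The five data points are invented for illustration only and are not the study's data; the function name `pearson_r` is likewise an assumption.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson product-moment correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (n >= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values (NOT the study's data): a frequency score per item
# and the mean response time (ms) to that item.
toy_frequency = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_mean_rt = [600.0, 590.0, 585.0, 570.0, 560.0]

r = pearson_r(toy_frequency, toy_mean_rt)
print(f"r = {r:.3f}")  # negative: higher frequency goes with shorter response times
```

A negative r, as in the toy data, corresponds to the pattern described above: the higher the frequency value, the shorter the response time.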






Another clear negative correlation involves collocation status. As the analysis in the first part 
of this research revealed, a lexical combination that was a collocation led to a faster response 
time, and the correlation results show the same trend. Though the results should be treated 
cautiously, the correlations could indicate a possible effect of frequency on collocational priming 
in Turkish. Further research is needed before strong claims can be made about the reasons for the 
priming effect. 

In addition to the correlation analysis, which revealed some significant relationships between the 
mean response time and frequency values, a regression analysis was carried out in order to investigate 
the potential predictors of the mean response time in the priming experiment, which could yield some 
information regarding the partial effect of frequency on the priming effect and the processing of 
collocations in the mental lexicon. 

The table below shows the regression analysis results of the monolingual collocational priming 
experiment. 


Table 5. Regression Analysis Results

Model                     B          SE B      Beta
Constant                  619.920    14.545
POS                       14.387     6.814     .179*
Target word frequency     -23.308    7.554     -.285*
t-score                   .643       1.004     -.085
MI score                  3.832      3.144     .347
ΔP 1|2                    -39.370    26.384    -.217
ΔP 2|1                    -23.963    25.286    -.131

Note for model: R=.543 and R²=.2295 (p<.001) 

* The significance level is p<.05 

The results of the regression showed that the predictors explained 22.9% of the variance in the 
model (R²=.229, F=4.76, p<.001). Part of speech significantly predicted the mean response time in 
the collocational priming experiment (β=.179, p=.05). In addition, target word frequency emerged as 
another significant predictor of mean response time (β=-.285, p=.05). The t-score may also predict 
the mean response time in the priming experiment, but its p value does not allow strong claims. 

Overall, part of speech and target word frequency appear to influence the mean response time more 
than the other variables, which index collocational frequency. The effect of part of speech is 
consistent with the earlier analysis, which showed faster processing of V+N collocations and a more 
robust priming effect in V+N than in ADJ+N word combinations. Therefore, one can assert that part of 
speech, target word frequency, and (tentatively) t-score play a partial role in how collocations are 
processed and appear to have an impact on collocational priming. Unlike the correlation analysis, in 
which target word frequency, t-score, ΔP, and MI all correlated significantly with the mean response 
times, the regression analysis did not single out the association measures as significant predictors 
of mean response time in the priming experiment. This is likely to raise some questions about the 
claims made earlier regarding the priming effect. However, it must be underlined that the experiment 
was designed, and the items controlled, in such a way that the participants saw different prime words 
but the same target words in the collocational and non-collocational items whose mean response times 
were compared to find evidence for collocational priming. To be more precise, if the subjects saw the 
collocational item derin uyku - “deep sleep”, the non-collocational item whose response time was 
taken into account in the analysis was gizli uyku - “secret sleep”. That is to say, the target words 
were the same, and the possible effect of differing word frequencies was eliminated. 

The explanations so far have addressed the L1 Turkish subjects’ performance in the priming study, 
the priming effects observed, and the relationship between the frequency values and the response 
times. The remaining sections interpret the findings of the collocational priming study and of the 
regression and correlation analyses, and attempt to explain collocational priming in Turkish by 
referring to a mental lexicon model. 


4. Discussion 

As Bybee (2005, p. 112) states, “words used together fuse together”. In a similar vein, Hoey (2005) 
claims that words are primed to co-occur and that the activation of a node spreads to its collocate. 
This priming is asserted to be the basis of our creative language system. Investigating the reality 
of collocational priming in Turkish, the current study attempted to shed light on the effect of 
frequency on a possible priming effect in Turkish and to approach the issue of mental lexicon 
organization from a syntagmatic perspective. 

The first overall conclusion that can be drawn from the results of the priming experiment and the 
regression and correlation analyses is that collocational priming seems to exist in Turkish for ADJ+N 
and V+N collocations with no case marking (though the regular word order in Turkish is N+V), and that 
frequency has an important impact on lexical processing. As stated earlier, the lexical items were 
presented in V+N order for a specific reason, and the fact that a priming effect emerged despite this 
irregular word order could be ascribed to the flexibility of Turkish word order, particularly in 
spoken production. In other words, as opposed to the strict word order of English V+N collocations, 
Turkish language users tend to switch between the two word orders (N+V vs. V+N) frequently, though 
N+V is strictly followed in the written form. Therefore, the facilitation of processing in spite of 
the irregular word order could stem from this informal use. Another explanation could be that 
collocational priming in Turkish is bidirectional, as suggested by the significant correlations 
between the mean response time and the ΔP values in both directions. 

4.1. Regression Results 

According to the results of the regression, the two significant predictors of the mean response 
time in the experiment were part of speech and target word frequency. The priming experiment revealed 
that the subjects responded considerably faster to the V+N lexical items than to the ADJ+N 
collocations, and the regression result identifying part of speech as a significant predictor of 
response time is in line with that finding. Though both part of speech categories showed significant 
priming effects, the gap between the mean response times of V+N collocations and non-collocations 
(23.2 milliseconds) is larger than the corresponding gap for ADJ+N combinations (13.0 milliseconds), 
leading to the assumption that nouns are processed faster when they are primed by a verb rather than 
an adjective in Turkish. 

There are some explanations in the literature for the faster response times to V+N collocational 
items than to ADJ+N combinations, though they are not conclusive and further evidence is needed. 
Approaching the issue from a generative perspective, Wolter and Gyllstad (2013) suggest that verbs 
are represented in higher nodes, and as the head node, in our internal grammar structure mechanism, 
which could indicate that they are processed first and faster than adjectives, which are processed as 
an integral part of an adjectival phrase. This view has its roots in Generative Linguistics, and what 
the researchers assert is that the phenomenon could also be valid within a usage-based approach to 
language. They further claim that faster V+N collocational processing is possibly due to the fact 
that verbs are entrenched as the most meaningful units of a constituent and, because they are 
generally more concrete and salient, bear stronger links with their neighbouring nouns. 

Another issue that needs to be emphasized is that the lexical items in the current research were 
selected with a specific purpose in mind, namely a cross-linguistic collocational priming experiment 
as the following step. The collocations used in the monolingual priming experiment were chosen from 
among lexical items with no case marking in order to avoid misleading results. For instance, during 
the trimming and frequency measuring process, the verb in the collocation karar vermek - “make a 
decision” was not lemmatized, so forms like vermesi (3rd person singular), vermen (2nd person 
singular), vermeden (without making), etc. were ignored. Including them could have made a difference 
in the processing durations, and the fact that no inflected forms were used might have resulted in 
faster response times for the collocational items. However, as adjectives are not inflected in 
Turkish, the same situation would not have arisen for ADJ+N collocations, which could have 
contributed to the different response times between the two groups of word pairs. 

The second significant predictor of mean response time in the priming study was target word 
frequency, and it needs further investigation. Although one may think that an effect of target word 
frequency on lexical decision is an expected result, the fact that single word frequency still plays 
a role while collocations are processed, particularly when there is evidence that priming is 
occurring, may mean more than the expected finding. To be more precise, it may mean that single word 
frequency contributes to the processing of collocations alongside collocational frequency. There is a 
common belief, backed by empirical evidence, that collocational items (formulaic phrases) in general 
are stored as chunks in the mental lexicon. On this view, when native speakers produce the language, 
they do not need to retrieve those lexical units separately because they are activated as a whole and 
processed holistically, which is what facilitates spontaneous speech and how fluency is achieved 
(Schmitt, 2010). However, the results of this study show that not only collocational frequency but 
also the frequency of the individual lexical items seems to be responsible for the speed of lexical 
processing. Wray (2012) summarizes some of the studies (e.g. Conklin & Schmitt, 2008) claiming 
holistic storage of formulaic language. She questions the reasons for the processing advantage and 
discusses the effect of repeated use on fused word strings before underlining the need for 
interdisciplinary research to provide stronger evidence on all these questions. 

4.2. Correlation Results 

In addition to the regression results, the correlations computed to find possible relationships 
indicated that the mean response times correlated negatively with target word frequency as well as 
with the association measures (t-score, ΔP in both directions, and MI), which was interpreted as a 
clear indication that frequency plays a critical role in how collocations are processed in Turkish. 
It may further be claimed that the more frequent a collocational item is, the stronger its priming 
effect or, in other words, the faster it is processed. 

Something that needs attention is the fact that one of the association measures used in this study, 
the MI value, did not show a strong correlation although the correlation was significant, which is at 
odds with some other research (e.g. Wolter and Yamashita, 2017). This finding could mean that, owing 
to its possible flaws, which were discussed in some earlier research, the MI value is not good at 
predicting collocational processing speed on its own. Alternatively, there may be a weak relationship 
between the MI value, which measures effect size and is sensitive to low frequency words, and 
collocational priming on the whole and processing speed in a lexical decision task investigating 
collocations in particular. As previous research also states, the MI value is prone to mislead 
research that takes frequency as its core concern and should be supported by other association 
measures, such as t-score (prioritizes adjusted frequency), ΔP (prioritizes directionality), Log Dice 
(prioritizes exclusivity), etc. to get a clearer picture (Gablasova, Brezina and McEnery, 2017). 
Another reason why the MI value did not show strong correlations could be the nature of the selected 
lexical items. The fact that they were very commonly used word combinations in everyday language and 
consisted of very high frequency lexical members could have resulted in low MI scores, which might 
not have reflected the psychological reality of the collocations in terms of the participants’ own 
experiences. 

One last point to discuss for the correlation analysis is that the ΔP values in both directions 
revealed significant negative correlations with the mean response times in the lexical decision task 
of the priming experiment, though the ΔP 2|1 correlation is weak. This could indicate a bidirectional 
relationship between the members of the collocational items. In other words, the higher the ΔP values 
of the collocations in either direction were, the faster the participants responded to the lexical 
items and the stronger the priming effect observed. To exemplify, in ADJ+N combinations the effect of 
the word soğuk - “cold” on the word savaş - “war” was as important for the processing durations as 
the effect of the word “war” on the word “cold”, and the same influence can be seen in V+N 
combinations, such as dikkat - “attention” and etmek - “pay”. 


5. Conclusions 

On the whole, the priming effect observed based on the findings of this study seems to be in 
accordance with Hoey’s (2005) and Durrant and Doherty’s (2010) claims about collocational priming 
underlining the importance of frequency in collocational processing. The fact that Hoey’s findings are 
consolidated by means of a morphologically different language, Turkish, makes his remarks more 
reliable and generalizable. Further research taking case marking into account in Turkish is needed to 
draw stronger conclusions about agglutinative languages, though. 

As to a mental lexicon model accounting for the collocational priming phenomenon as well as the 
semantic, orthographic and phonological aspects of lexical processing, the Spreading Activation Model 
(Collins and Loftus, 1975) can be seen as the best-fitting framework, as it emphasizes the activation 
of semantically related nodes as well as collocational items when a certain word is seen or heard by 
a language user. To be more precise, when a prime is presented (e.g. sağanak - “heavy”), the 
activation spreads to its collocate (yağmur - “rain”) and facilitates its processing, as well as that 
of some semantically related items, such as “light”, “weight” etc. This spreading activation could be 
influenced by the salience and frequency of those single lexical items in addition to their 
collocational association strength. Salience and frequency are two important aspects of lexical 
processing underlined by cognitive linguists (Tomasello, 2003), as they play an important role in how 
entrenched single words or word combinations are in the mental lexicon and reflect how often language 
users are exposed to them in their everyday life. 
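
The idea of activation spreading from a prime to its collocate and to semantically related items can be made concrete with a toy network. The sketch below is only an illustration under stated assumptions: the link weights are invented, the single-step spreading rule (source activation times link weight) is a simplification chosen for clarity, and the names `LINKS` and `spread_activation` are hypothetical. It is neither the framework proposed in this study nor an implementation of Collins and Loftus (1975).

```python
# Toy lexical network: prime -> {linked item: link strength between 0 and 1}.
# All weights are invented for illustration; in the framework discussed above
# they would depend on frequency, salience and collocational association strength.
LINKS = {
    "sağanak": {
        "yağmur": 0.8,   # collocational link ("heavy" -> "rain")
        "light": 0.3,    # semantically related item
        "weight": 0.2,   # semantically related item
    },
}


def spread_activation(prime: str, source_activation: float = 1.0) -> dict[str, float]:
    """Spread activation one step from the prime to every item linked to it."""
    neighbours = LINKS.get(prime, {})
    return {item: source_activation * strength for item, strength in neighbours.items()}


activation = spread_activation("sağanak")
most_activated = max(activation, key=activation.get)
print(most_activated, activation)
```

In this toy setting the collocate receives the most activation only because it was given the strongest link; which links are actually strongest is an empirical question.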

Figure 2 shows a sample lexical organization network illustrating the spreading activation of 
semantically related and collocational items, which can be regarded as an extension to the Revised 
Spreading Activation Model by Bock and Levelt (1994). A similar cross-linguistic form of this model 
was proposed by Wolter and Yamashita (2014). Concepts are displayed in capital letters, whereas the 
lexical units are in small letters. Two-way arrows stand for possible bidirectional interaction and 
one-way arrows reflect the supposed direction of the lexical spreading. The activation of certain 
concepts is assumed to trigger the lexical items related to that concept (semantic or collocational 
in this case) together with the corresponding conceptual domains. The activation seems to take place 
at both the syntagmatic and the paradigmatic level in the proposed lexical organization framework, 
and the strength of the links between the lexical units appears to be influenced by the frequency of 
the lexical units and the collocations. This must be seen as one layer of the lexical activation and 
access procedure. Different layers including phonetics, morphology and orthography can be added; 
however, they are not the main focus of the current research and need to be addressed in a separate 
study. 

It should also be noted that the proposed framework is nothing more than an assumption based on 
the results of a single research study and more empirical studies are required for a generalizable and 
multi-layered depiction of the internal lexicon at the lexical activation and access level, in particular. 



Figure 2. Proposed Lexical Organization in the Mental Lexicon 

The framework proposed on the basis of the current study needs further evidence from different 
cognitive methodologies to confirm collocational spreading activation, such as eye tracking (see 
Roberts and Siyanova-Chanturia, 2013; Carrol and Conklin, 2014 for a review of the use of 
eye-tracking to investigate lexical processing) and neuroimaging (see Henson, 2003 for a review of 
neuroimaging studies of priming). Until then, the idea of collocational spreading activation must be 
treated as tentative. In addition, collocational priming, its psycholinguistic reality and, in 
particular, its role in the organization of the internal lexicon need further investigation through 
the lens of morphologically different languages. This study, which focuses on collocational priming 
and the effect of frequency on this phenomenon in Turkish, could be regarded as a stepping-stone and 
aims to arouse more interest in lexical studies in Turkish. 


6. Limitations and Suggestions for Further Research 

First, the lack of lemmatization can be seen as a flaw of this study, since including all the 
inflections of a verb or a noun in Turkish could allow a more thorough analysis, and lemmatization 
should be applied in future work. For instance, a collocation in Turkish like karar vermek - “make a 
decision” can take many forms depending on the subject of the sentence. It may take the form kararını 
vermek - “make his decision” or karar vermesi - “making a decision”, which could make a difference in 
the processing times of the word pairs in a priming experiment. In addition, if all the forms of each 
lemma are taken into account while measuring frequency, the overall effect of frequency on processing 
times is likely to be seen from a different angle. Furthermore, different forms of a word could prime 
different lexical items. To exemplify, if the bare form okul - “school” is used as a prime word, it 
is likely to prime a noun, önlüğü - “uniform”, in Turkish. However, if the inflected form okula - “to 
the school” is used, the verb gitmek - “go” seems more likely to be primed. 

It should also be noted that lemmatization was omitted in this research mainly because the Turkish 
National Corpus (TNC) lacks a lemmatized search option, which made classifying and integrating every 
inflected form challenging and time-consuming. The researcher had to make this decision owing to time 
constraints. 

Furthermore, some methodological extensions can be considered. For instance, a different SOA 
(Stimulus Onset Asynchrony) may yield different results, and comparing priming experiments with 
different SOAs can inform the distinction between automatic and strategic priming, for which certain 
underpinnings already exist in the priming literature. To be more precise, given the layout of this 
experiment, an SOA of 50 milliseconds rather than 100 milliseconds could have made a difference in 
terms of the priming effect. It would then have been possible to claim that collocational priming 
exists in Turkish even under masked priming conditions, which are claimed to occur at 50 milliseconds 
or less (Altarriba and Basnight-Brown, 2007). In future research, the results of collocational 
priming experiments with both SOAs can be compared to analyse any difference and to explore the 
influence of prime duration on collocational priming, if any. 

One of the extensions the authors intend to make in their upcoming research is the inclusion of 
lexical transparency in the regression model as a new and promising independent variable. 


Appendix A 


A.1. VERB+NOUN 

Turkish V+N collocation   English translation   Turkish V+N non-collocation   Direct English translation
hata yapmak               make a mistake        hata almak                    take mistake
izin vermek               give permission       izin gitmek                   go permission
keyif almak               take pleasure         keyif görmek                  see pleasure
huzur bulmak              find solace           huzur bakmak                  look for solace
şefkat göstermek          show affection        şefkat öğrenmek               learn affection
nefes almak               take breath           nefes yapmak                  make breath
çözüm bulmak              find a solution       çözüm bilmek                  know solution
cinayet işlemek           commit murder         cinayet bağırmak              shout murder
öncelik vermek            give priority         öncelik gitmek                go priority
keşif yapmak              make a discovery      keşif almak                   buy discovery
ipucu bulmak              find a clue           ipucu bakmak                  look at clue
kalp kırmak               break heart           kalp silmek                   erase heart
ateş açmak                open fire             ateş tutmak                   keep fire
zafer kazanmak            win a victory         zafer tutmak                  keep victory
zaman geçirmek            pass time             zaman kurtarmak               save time
karar vermek              make a decision       karar gitmek                  go decision
dikkat etmek              pay attention         dikkat yapmak                 make attention
şüphe uyandırmak          cast doubt            şüphe kızdırmak               annoy doubt
iflas etmek               go bankrupt           iflas olmak                   be bankrupt
ara vermek                take a break          ara görmek                    see break
ihtiyaç duymak            feel the need         ihtiyaç sormak                ask need
baskı yapmak              put pressure          baskı etmek                   do pressure
kilo vermek               lose weight           kilo görmek                   see weight
ziyaret etmek             pay a visit           ziyaret olmak                 be visit
ışık tutmak               shed light            ışık koymak                   put light
örnek olmak               set an example        örnek etmek                   do example
sakal bırakmak            grow beard            sakal görüşmek                discuss beard
kaza yapmak               have an accident      kaza etmek                    do accident
vurgu yapmak              place emphasis        vurgu olmak                   be emphasis
sır saklamak              keep a secret         sır götürmek                  get secret

A.2. ADJ+NOUN 

Turkish ADJ+N collocation   English translation   Turkish ADJ+N non-collocation   Direct English translation
derin uyku                  deep sleep            gizli uyku                      secret sleep
soğuk savaş                 cold war              uzak savaş                      far war
dış dünya                   outside world         geç dünya                       late world
kuvvetli delil              strong evidence       şiddetli delil                  heavy evidence
çıplak göz                  naked eye             yapay göz                       artificial eye
sıcak karşılama             warm welcome          mevcut karşılama                current welcome
acı son                     bitter end            hoş son                         nice end
ateşli tartışma             heated debate         şanslı tartışma                 lucky debate
zengin tarih                rich history          sayılı tarih                    limited history
altın çağ                   golden age            kesin çağ                       certain age
orta sınıf                  middle class          ağır sınıf                      heavy class
karşıt görüş                opposing view         neşeli görüş                    happy view
yüksek mahkeme              high court            güzel mahkeme                   beautiful court
ölümsüz aşk                 undying love          çelimsiz aşk                    thin love
beyaz yalan                 white lie             siyah yalan                     black lie
açık fikir                  open mind             temel fikir                     basic mind
uzun vade                   long run              açık vade                       open run
sağanak yağmur              heavy rain            gururlu yağmur                  proud rain
yoğun duman                 thick smoke           hızlı duman                     fast smoke
kabarık saç                 wiry hair             endişeli saç                    worried hair
keskin koku                 strong smell          parlak koku                     shiny smell
takma diş                   false tooth           sisli diş                       foggy tooth
koyu kahve                  strong coffee         adil kahve                      fair coffee
alkolsüz içki               soft drink            renksiz içki                    colorless drink
itici güç                   driving force         nazik güç                       kind force
yüksek bina                 tall building         ciddi bina                      serious building
büyük başarı                high achievement      doğru başarı                    correct achievement
sert düşüş                  sharp fall            ucuz düşüş                      cheap fall
köklü değişiklik            drastic change        kızgın değişiklik               annoyed change
tam yetki                   free rein             az yetki                        few rein


Appendix B 

How t, MI, and Delta P (ΔP) scores are computed 

The formula that COCA employs to compute the MI score, which indicates how strongly related word 
pairs are, is as follows: 

“MI = log((AB * sizeCorpus)/(A * B * span))/log(2)” 

AB = frequency of the collocation (e.g. "heavy" used in front of the noun "rain") 
sizeCorpus = size of the corpus (number of words) 
A = frequency of the node word (e.g. "heavy") 
B = frequency of the collocate (e.g. "rain") 
span = span of words (note: 4 left and 4 right = an 8-word span in total was used) 
log(2) = the log10 of the number 2 

The calculation indicates that the bigger the MI value, the stronger the relationship between the 
lexical items. As stated earlier, word pairs with an MI of 3.0 or higher were accepted as valid and 
included in the study, since 3.0 is claimed to be enough to state that a word pair does not co-occur 
randomly (Durrant and Doherty, 2010). 
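
As a rough illustration, the MI formula quoted above can be written as a small Python function. The parameter names mirror the symbols in the formula; the function name `mi_score` is an assumption, and the counts in the example call are hypothetical, not taken from COCA or the Turkish National Corpus.

```python
import math


def mi_score(ab: int, a: int, b: int, size_corpus: int, span: int = 8) -> float:
    """MI = log((AB * sizeCorpus) / (A * B * span)) / log(2), i.e. a base-2 logarithm."""
    ratio = (ab * size_corpus) / (a * b * span)
    return math.log10(ratio) / math.log10(2)


# Hypothetical counts for illustration only.
mi = mi_score(ab=50, a=1_000, b=500, size_corpus=1_000_000)
print(f"MI = {mi:.2f}")
print("passes the 3.0 threshold:", mi >= 3.0)
```

Dividing by log(2) converts the logarithm to base 2, so the base used in the numerator does not matter as long as the same base is used in both places.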

The other association measure is computed as follows: 

$$t\text{-score} = \frac{O - E}{\sqrt{O}}$$

O: observed frequency of the collocation 
E: expected frequency of the collocation 

After the expected frequency is subtracted from the observed frequency, the result is divided by the 
standard deviation, which is approximated by the square root of the observed frequency. Durrant and 
Doherty (2010) state that t values of 2.0 or higher show a statistically significant difference and 
are sufficient to claim that a word pair is a collocation. 
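
The t-score formula can be sketched in the same way. The function below takes the observed and expected frequencies as given; how the expected frequency is derived from corpus counts is not spelled out here, so it is left to the caller. The function name `t_score` and the example values are hypothetical.

```python
import math


def t_score(observed: float, expected: float) -> float:
    """t = (O - E) / sqrt(O), with O the observed and E the expected frequency."""
    if observed <= 0:
        raise ValueError("observed frequency must be positive")
    return (observed - expected) / math.sqrt(observed)


# Hypothetical frequencies for illustration only.
t = t_score(observed=50, expected=4)
print(f"t = {t:.2f}")
print("passes the 2.0 threshold:", t >= 2.0)
```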

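Gries (2013) thinks that directional measures of collocational frequency have some drawbacks and, 
as he claims, ΔP succeeds in addressing these flaws by normalizing conditional probabilities, which 
makes ΔP a psychologically and psycholinguistically realistic measure. 

The complementary association measure, ΔP, included later in the study is computed as follows, 
where a, b, c, and d are the four cells of the 2×2 co-occurrence table (a: both words present; b: 
word 1 present, word 2 absent; c: word 1 absent, word 2 present; d: both words absent): 

$$\Delta P_{2|1} = p(\text{word}_2 \mid \text{word}_1 = \text{present}) - p(\text{word}_2 \mid \text{word}_1 = \text{absent}) = \frac{a}{a+b} - \frac{c}{c+d}$$

$$\Delta P_{1|2} = p(\text{word}_1 \mid \text{word}_2 = \text{present}) - p(\text{word}_1 \mid \text{word}_2 = \text{absent}) = \frac{a}{a+c} - \frac{b}{b+d}$$

A sample calculation of ΔP is as follows: 

Co-occurrence of the words “of” and “course” in the spoken component of the British National Corpus 

               course: present   course: absent   Totals
of: present    5,610             168,938          174,548
of: absent     2,257             10,233,063       10,235,320
Totals         7,867             10,402,001       10,409,868

$$\Delta P_{2|1} = p(\text{course} \mid \text{word}_1 = \text{of}) - p(\text{course} \mid \text{word}_1 \neq \text{of}) = \frac{5610}{174548} - \frac{2257}{10235320} \approx 0.032$$

$$\Delta P_{1|2} = p(\text{of} \mid \text{word}_2 = \text{course}) - p(\text{of} \mid \text{word}_2 \neq \text{course}) = \frac{5610}{7867} - \frac{168938}{10402001} \approx 0.697$$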

The numbers show that the word “course” is a better cue to “of” than vice versa. 
Given that each association measure discussed so far has its pluses and minuses, the current research 
included MI, t, and ΔP values in the analysis to explore a possible frequency effect in collocational 
priming. 
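
The sample calculation for ΔP (Delta P) above can be reproduced with a few lines of Python. The function name `delta_p` and the cell labels a, b, c, d (the four cells of the 2×2 table, read row by row) are illustrative; the cell d is taken as 10,233,063, the value implied by the row and column totals of the table.

```python
def delta_p(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Return (ΔP 2|1, ΔP 1|2) for a 2x2 co-occurrence table.

    a: word 1 present, word 2 present    b: word 1 present, word 2 absent
    c: word 1 absent,  word 2 present    d: word 1 absent,  word 2 absent
    """
    dp_2_given_1 = a / (a + b) - c / (c + d)
    dp_1_given_2 = a / (a + c) - b / (b + d)
    return dp_2_given_1, dp_1_given_2


# "of" (word 1) and "course" (word 2) in the spoken BNC, as in the sample table.
dp_course_given_of, dp_of_given_course = delta_p(a=5610, b=168938, c=2257, d=10233063)
print(f"ΔP 2|1 = {dp_course_given_of:.3f}")
print(f"ΔP 1|2 = {dp_of_given_course:.3f}")
```

Because the two directions are computed separately, the measure can show an asymmetric association, which is what the comparison of the two values in the sample calculation illustrates.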


Eşdizimli kelimelerde öncelemenin Türkçe bağlamında incelenmesi (An investigation of collocational priming in the context of Turkish) 


Abstract (translated from the Turkish Öz) 

Many attempts have been made to explain how the monolingual mental lexicon is organized, and each 
model proposed so far has addressed a different dimension of lexical processing. What these models 
have in common is that their descriptions overlook word groups such as collocations and that 
paradigmatic relations come to the fore in their approaches. The Lexical Priming Theory put forward 
by Hoey (2005) attempts to shed light, from a cognitive and psycholinguistic perspective, on the 
processing of collocations in the mental lexicon and on the importance of this processing for our 
creative language production. Many psycholinguistic studies have tested Hoey’s theory in the context 
of English. The present study broadens the scope of work in this area by examining collocational 
priming in the context of Turkish. In addition, the possible effect of frequency and part of speech 
on this process was scrutinized by examining the relationship between the lexical decision times in 
the priming experiment and these independent variables. The findings, which support Hoey’s claims, 
point to a significant priming effect for native speakers of Turkish. The regression analysis showed 
that frequency and part of speech are important predictors of processing time. Finally, a strong 
correlation was found between lexical decision times and collocational frequency. Based on the 
findings of the research, a modest mental lexicon model is proposed. 


Keywords: mental lexicon; collocational priming; frequency 